Word Sense Disambiguation using Static and Dynamic Sense Vectors
Abstract
Word sense disambiguation (WSD) commonly relies on contextual information from sense-tagged training data: words that co-occur with an ambiguous word within a limited context window support one of its senses over the others. This paper reports on word sense disambiguation of English words using static and dynamic sense vectors. First, context vectors are constructed from the contextual words¹ in the sense-tagged training data. Then, the words in each context vector are weighted with local density. Using the whole sense-tagged training data, each sense of a target word² is represented as a static sense vector in word space, which is the centroid of the context vectors. Contextual noise is then removed by automatic selective sampling, which uses an information retrieval technique to enhance discriminative power. For each test case, the automatic selective sampling method retrieves N relevant training samples to reduce noise. From these samples, we construct another set of sense vectors, one for each sense of the target word. They are called dynamic sense vectors because they change according to the target word and its context. Finally, the sense of a target word is determined using both static and dynamic sense vectors. The English SENSEVAL test suite is used for the experiments, and our method produces relatively good results.
¹ ‘Contextual words’ are defined as the list of content words in the context.
² In this paper, a target word ‘Wt’ is a semantically ...
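The pipeline described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: raw term counts stand in for the local-density weights (which the abstract does not define), cosine similarity stands in for the unspecified information retrieval technique, and summing the static and dynamic similarities is an assumed way of combining the two vectors. The function names and the toy data are hypothetical.

```python
import math
from collections import Counter


def context_vector(words):
    """Bag-of-words context vector. Raw counts are an assumption that
    stands in for the paper's local-density weighting."""
    return Counter(words)


def centroid(vectors):
    """Average a list of sparse vectors into one sparse vector."""
    total = Counter()
    for vec in vectors:
        total.update(vec)
    n = len(vectors)
    return {term: weight / n for term, weight in total.items()} if n else {}


def cosine(a, b):
    """Cosine similarity between two sparse vectors (0.0 if either is empty)."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_centroids(samples):
    """Group (sense, vector) pairs by sense and return one centroid per sense."""
    by_sense = {}
    for sense, vec in samples:
        by_sense.setdefault(sense, []).append(vec)
    return {sense: centroid(vecs) for sense, vecs in by_sense.items()}


def disambiguate(training, test_context, n=2):
    """training: list of (sense, context_words) pairs for one target word."""
    test_vec = context_vector(test_context)
    samples = [(sense, context_vector(words)) for sense, words in training]

    # Static sense vectors: centroid of all training contexts of each sense.
    static = group_centroids(samples)

    # Selective sampling: keep the n training samples closest to the test context.
    retrieved = sorted(samples, key=lambda sv: cosine(sv[1], test_vec), reverse=True)[:n]

    # Dynamic sense vectors: centroids built only from the retrieved samples.
    dynamic = group_centroids(retrieved)

    # Assumed combination rule: add the static and dynamic similarities.
    scores = {
        sense: cosine(static[sense], test_vec) + cosine(dynamic.get(sense, {}), test_vec)
        for sense in static
    }
    return max(scores, key=scores.get)


# Toy sense-tagged data for the target word "bank" (invented for illustration).
training = [
    ("finance", ["money", "loan", "deposit"]),
    ("finance", ["account", "money", "interest"]),
    ("river", ["water", "shore", "fishing"]),
    ("river", ["river", "water", "boat"]),
]
print(disambiguate(training, ["water", "boat", "trip"]))
```

Because the dynamic vectors are rebuilt from the retrieved samples for every test context, they differ from one test case to the next, whereas the static vectors are computed once from the full training set.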
Similar resources
Word Sense Disambiguation with Information Retrieval Technique
This paper reports on word sense disambiguation of Korean nouns with information retrieval technique. First, context vectors are constructed using contextual words in training data. Then, the words in the context vector are weighted with local density. Each sense of a target word is represented as ‘Static Sense Vector’ in word space, which is the centroid of the context vectors. Contextual nois...
Supervised and Unsupervised Word Sense Disambiguation on Word Embedding Vectors of Unambiguous Synonyms
This paper compares two approaches to word sense disambiguation using word embeddings trained on unambiguous synonyms. The first one is an unsupervised method based on computing log probability from sequences of word embedding vectors, taking into account ambiguous word senses and guessing correct sense from context. The second method is supervised. We use a multilayer neural network model to l...
Word Sense Disambiguation of Ambiguous Persian Words Using the LDA Topic Model
Word sense disambiguation is the task of identifying the correct sense of a word in a given context from a finite set of possible senses. In this paper a model for Farsi word sense disambiguation is presented. The model uses two groups of features: first, all words and stop words around the target word, and second, topic models. We extract topics from a Farsi corpus with Latent Dirichlet ...
Word Sense Disambiguation Using Vectors of Co-occurrence Information
This paper reports on the word sense disambiguation of Korean nouns using co-occurrence information in context. For a given noun, its local contextual word distribution is not enough to express its semantic characteristics for noun sense disambiguation. This paper proposes a cluster-based sense as a base vector. Contextual noise is removed by a term weighting method, and hypernyms of remain...
First-order and second-order context representations: geometrical considerations and performance in word-sense disambiguation and discrimination
First-order and second-order context vectors (C1 and C2) are two rival context representations used in word-sense disambiguation and other endeavours related to distributional semantics. C1 vectors record directly observable features of a context, whilst C2 vectors aggregate vectors themselves associated to the directly observable features of the context. Whilst C vectors may appeal on a number of ...
Publication date: 2002